Data Visualization in R 💻

Oluwafemi Oyedele

2024-06-30

Agenda

  • Introduction to the Grammar of Graphics

  • We will look at the different layers in ggplot2

  • The focus of this talk will be on the 20% that is useful 80% of the time

  • My goal is to make you excited about ggplot2!

  • I will entertain questions at the end

Packages Used Today

  • This workshop focuses on data visualization with ggplot2.

  • ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use (geoms), and it takes care of the details.

library(ggplot2) 
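
As a first taste of that description, here is a minimal sketch. It assumes the `mpg` example dataset that ships with ggplot2; the choice of `displ` and `hwy` is only for illustration.

```r
library(ggplot2)

# Provide the data, map variables to aesthetics, choose a geom.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

p  # printing the object draws the plot
```

ggplot2 fills in the axes, breaks, and labels without being asked.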

Grammar of Graphics

  • First published in 1999

  • A theoretical deconstruction of data graphics

  • Foundation for many graphic applications

  • The Grammar of Graphics can be applied to every type of plot

  • Concisely describes the components of a graphic

Grammar of Graphics

  • Your dataset

  • Tidy format: one row per observation, one column per variable

  • There is no visualization without a dataset
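
A small sketch of the data layer on its own, again assuming the `mpg` dataset bundled with ggplot2. Passing only the data gives an empty canvas, because nothing has been mapped or drawn yet.

```r
library(ggplot2)

# mpg is already tidy: one row per car model, one column per variable.
head(mpg)

# The data layer alone: a blank plot waiting for aesthetics and geoms.
p_data <- ggplot(data = mpg)
p_data
```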

Grammar of Graphics

  • Aesthetic mapping: links variables in the data to graphical properties of the geometry.

  • Properties we can specify within the aesthetic mapping include colour, shape, alpha, fill, and size.
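
The sketch below shows the difference between mapping a property to a variable and setting it to a fixed value. It assumes the `mpg` dataset; the particular values for `alpha` and `size` are arbitrary.

```r
library(ggplot2)

p_aes <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +  # mapped: varies with the data
  geom_point(alpha = 0.6, size = 2)                              # set: the same for every point

p_aes
```

Anything inside `aes()` is driven by a variable and gets a legend; anything outside it is a constant.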

Grammar of Graphics

  • Statistics transform input variables into displayed values:

    • Bins for a histogram, using stat_bin()

    • Summary statistics for a box plot, using stat_boxplot()

    • Number of observations in each category for a bar chart, using stat_count()

  • Even tidy data may need some transformation

  • Each statistic is linked to a geometry: every geom has a default stat
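
To make the link between statistics and geometries concrete, here is a short sketch using the `mpg` dataset (an assumption; any data frame with a categorical column would do). The first two plots are equivalent ways of writing the same bar chart.

```r
library(ggplot2)

# geom_bar() uses stat_count() by default, so ggplot2 counts the rows for us.
p_bar  <- ggplot(mpg, aes(x = class)) + geom_bar()

# The same plot written from the statistic's side.
p_stat <- ggplot(mpg, aes(x = class)) + stat_count(geom = "bar")

# geom_histogram() relies on stat_bin() to bin a continuous variable.
p_hist <- ggplot(mpg, aes(x = hwy)) + geom_histogram(bins = 20)

# layer_data() shows the values the statistic computed.
layer_data(p_bar, 1)[, c("x", "count")]
```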

Grammar of Graphics

  • Scales help us to control the mapping from data to aesthetics

  • Scales also provide the tools that let you interpret the plot: the axes and legends.

  • Scales are generated automatically in ggplot2 and can be customized:

    • log scale, e.g. scale_x_log10()

    • limits can also be set within the scale
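
A sketch of customizing scales, assuming the `mpg` dataset. The axis titles, legend title, and limits are illustrative choices, not recommendations.

```r
library(ggplot2)

p_scale <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  scale_x_log10(name = "Engine displacement (litres, log scale)") +
  scale_y_continuous(name = "Highway miles per gallon", limits = c(10, 50)) +
  scale_colour_discrete(name = "Drive train")

p_scale
```

Limits set on a scale drop observations that fall outside them before any statistics are computed, so they can change summaries such as box plots or smoothers.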

Grammar of Graphics

  • Geometries interpret the aesthetics as a graphical representation

  • The geometry determines your plot type:

    • bar chart: geom_bar()
    • scatter plot: geom_point()
    • box plot: geom_boxplot()
    • histogram: geom_histogram()
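
The sketch below reuses one data and aesthetic specification with different geoms, assuming the `mpg` dataset. `p_base` is just a name chosen here for the shared starting point.

```r
library(ggplot2)

# Same data and aesthetics; only the geometry changes.
p_base <- ggplot(mpg, aes(x = class, y = hwy))

p_box    <- p_base + geom_boxplot()   # one box per class
p_points <- p_base + geom_point()     # one point per car

# Geoms can also be layered on top of each other.
p_both <- p_base +
  geom_boxplot() +
  geom_jitter(width = 0.2, alpha = 0.4)

p_both
```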

Grammar of Graphics

  • Facets divide your data into panels using one or two grouping variables

  • Faceting lets you look at smaller subsets of the data
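
A sketch of both faceting functions, assuming the `mpg` dataset. `facet_wrap()` splits by one variable; `facet_grid()` lays panels out by two.

```r
library(ggplot2)

p_scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# One grouping variable: a panel per vehicle class, wrapped into rows.
p_wrap <- p_scatter + facet_wrap(~ class)

# Two grouping variables: drive train in rows, cylinders in columns.
p_grid <- p_scatter + facet_grid(drv ~ cyl)

p_wrap
```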

Grammar of Graphics

  • A coordinate system maps the position of objects onto the plane of the plot.

  • It is also the physical mapping of the aesthetics to the paper

  • Coordinate systems affect all position variables simultaneously and differ from scales in that they also change the appearance of the geometric objects.

  • Coordinate systems control how the axes and grid lines are drawn.
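
A sketch of swapping coordinate systems on the same plot, assuming the `mpg` dataset. The zoom range passed to `coord_cartesian()` is arbitrary.

```r
library(ggplot2)

p_bar <- ggplot(mpg, aes(x = class)) + geom_bar()

p_flip  <- p_bar + coord_flip()    # horizontal bars: x and y swap places
p_polar <- p_bar + coord_polar()   # the same bars drawn as wedges

# Zooming with the coordinate system keeps every observation in the data.
p_zoom <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  coord_cartesian(xlim = c(2, 4))

p_flip
```

The bars in `p_polar` change shape, not just position, which is the sense in which coordinate systems alter the appearance of the geometric objects.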

Grammar of Graphics

  • The theme controls the overall look of the plot

  • It spans every part of the graphic that is not linked to the data

  • Themes give you control over things like fonts, ticks, panel strips, and backgrounds
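
A sketch of theming, assuming the `mpg` dataset. The title text and the specific theme settings are illustrative only.

```r
library(ggplot2)

p_plain <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  labs(title = "Engine size and fuel economy")

p_theme <- p_plain +
  theme_minimal(base_size = 14) +                  # a complete built-in theme
  theme(
    plot.title       = element_text(face = "bold"),   # fonts
    axis.ticks       = element_line(colour = "grey40"),
    panel.grid.minor = element_blank(),               # background details
    legend.position  = "bottom"
  )

p_theme
```

The points are identical in both plots; only the non-data elements change.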

Demo on ggplot2